Discovering Empirical Equations from Robot-Collected Data

Authors

  • Kuang-Ming Huang
  • Jan M. Zytkow
Abstract

Discovery of multidimensional empirical equations has been a task of systems such as BACON and FAHRENHEIT. When confronted with data collected in a robotic experiment, the BACON-like generalization mechanism of FAHRENHEIT reached an impasse because it found many acceptable equations for some datasets and none for others. We describe an improved generalization mechanism that handles both problems. We apply that mechanism to a robot arm experiment similar to Galileo's experiments with the inclined plane. The system collected data, determined empirical error, and eventually found empirical equations acceptable within error. By confronting empirical equations developed by FAHRENHEIT with theoretical models based on classical mechanics, we have shown that empirical equations provide a superior fit to data. Systematic deviations between data and a theoretical model hint at processes not captured by the model but accounted for in empirical equations.

1 Robotic experiment and challenges of real data

We describe a discovery mechanism that makes several improvements over the BACON-like search for multidimensional empirical equations. It has been motivated by an impasse reached by the existing systems, such as BACON and FAHRENHEIT, when applied to data produced automatically in robotic experiments.

1.1 Challenges to BACON-like search for equations

BACON-like search for multidimensional equations proceeds step by step, adding one independent variable at a time. Suppose the experimenter collects data by setting the values of three variables $x_1$, $x_2$, and $x_3$, and measuring $y$. The final goal is an equation that combines $y$, $x_1$, $x_2$, and $x_3$. It can often be expressed in a form convenient for predictions: $y = f(x_1, x_2, x_3)$.

At any step of BACON's generalization from data to equations, independent variables can be divided into three categories: (1) those that have already been used in an equation (for example, $x_1$), (2) the one variable that is being added (for example, $x_2$), and (3) those variables that have been kept constant in all experiments ($x_3$). Generalization to $x_2$ is triggered when an equation has been found for $y$ and $x_1$. The generalization process starts with data collection: for each value of $x_2$, an equation is sought for $y$ and $x_1$. Data collection is successful when BACON reaches an externally selected number $n$ of equations of the same algebraic form $f(y, x_1, A_1, \ldots, A_k) = 0$, which may have different values of the parameters $A_1, \ldots, A_k$. Each equation corresponds to one value of $x_2$. The following table summarizes the "higher level" data used for generalization. Each $a_{ij}$ represents the $j$-th value of parameter $A_i$:

$$
\begin{array}{c|ccc}
x_2 & A_1 & \cdots & A_k \\
\hline
v_1 & a_{11} & \cdots & a_{k1} \\
v_2 & a_{12} & \cdots & a_{k2} \\
\vdots & \vdots & & \vdots \\
v_n & a_{1n} & \cdots & a_{kn}
\end{array}
$$

Now the task is to find $k$ equations of the form $A_i = g_i(x_2)$, $i = 1, \ldots, k$, that link $x_2$ with each of $A_1, \ldots, A_k$, one at a time. Those equations can be used to eliminate $A_1, \ldots, A_k$ from the original equation $y = f(x_1, A_1, \ldots, A_k)$. As a result, the equation for $y$ uses two independent variables $x_1$ and $x_2$ plus several new parameters $B_1, \ldots, B_m$, which can be used to generalize the equation to the next independent variable.

BACON works fine when it discovers a single equation per dataset, the same for all data. Langley et al. (1987) offer many examples of successful search, but pay little attention to the many ways in which the BACON search may fail. Even a simple failure at one of many steps causes the whole system to fail and halt. In this paper we describe solutions to the following problems:

1. Search results in multiple alternative equations that offer acceptable fit to data. If equations of different form are best for different datasets, which equation should be selected for generalization?

2. No equation of common form is acceptable for each dataset. This prevents a meaningful generalization, as all values of any given parameter $A_i$, such as the slope of a linear equation, must have the same meaning so that the induction over different values makes sense.

3. Equations can be evaluated when the measurement error is known. In BACON, the value of error is provided from the outside, but it should be determined by experiments since it is specific to a given experiment setup. Evaluation may be overly demanding or too permissive if the system uses wrong values of error. A discovery method should use data to infer the error and then propagate the error so that it applies at all stages of the discovery process.

1.2 Experiment setup

Let us distinguish between two meanings of experiment. In the first meaning, an experiment includes the investigated empirical system S, the manipulating and measuring equipment, and the strategy of using them. We shall call it a setup experiment. In the second meaning, an experiment is a single cycle of interaction between the experimenter and empirical system S. The cycle consists of creating a particular state of S, determined by the values of some variables, and measuring the response of S in terms of other variables. In this paper we consider a discovery system that performs many experiments in the second sense, by varying objects and their properties within the fixed setup.
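
The generalization step of Section 1.1 can be illustrated with a minimal sketch. It is not the BACON or FAHRENHEIT implementation: it assumes that every equation is linear, whereas the real systems search over many algebraic forms. The function names (`fit_line`, `generalize`, `hypothetical_experiment`) and the noise-free law used as data are invented for illustration. The sketch fits $y = A_1 x_1 + A_2$ for each value of $x_2$, collects the "higher level" table of parameter values, and then fits $A_1$ and $A_2$ as functions of $x_2$.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def generalize(experiment, x1_values, x2_values):
    """Add x2 to an equation already found for y and x1.

    Level 1: for each x2 value, fit y = A1 * x1 + A2.
    Level 2: fit A1 = B1 * x2 + B2 and A2 = B3 * x2 + B4.
    The linear forms are an assumption of this sketch.
    """
    table = []  # rows (x2, A1, A2): the "higher level" data
    for x2 in x2_values:
        ys = [experiment(x1, x2) for x1 in x1_values]
        a1, a2 = fit_line(x1_values, ys)
        table.append((x2, a1, a2))
    x2s = [row[0] for row in table]
    b1, b2 = fit_line(x2s, [row[1] for row in table])
    b3, b4 = fit_line(x2s, [row[2] for row in table])
    return table, (b1, b2, b3, b4)


def hypothetical_experiment(x1, x2):
    # Invented noise-free law, used only to exercise the sketch.
    return (2.0 * x2 + 1.0) * x1 + (3.0 * x2 - 4.0)


table, params = generalize(
    hypothetical_experiment,
    x1_values=[0.0, 1.0, 2.0, 3.0],
    x2_values=[1.0, 2.0, 3.0],
)
for x2, a1, a2 in table:
    print(f"x2={x2:.1f}  A1={a1:.3f}  A2={a2:.3f}")
print("B1..B4 =", [round(b, 3) for b in params])
```

Substituting the second-level fits into the first-level form gives $y = (B_1 x_2 + B_2)\,x_1 + (B_3 x_2 + B_4)$, an equation in two independent variables with four new parameters. With real, noisy data, each fit would also have to be accepted or rejected against the measurement error, which is where the three problems listed in Section 1.1 arise.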

Publication date: 1997